(54) Client-server based speech recognition

(57) A speech communication system comprising a 
speech input terminal and a speech recognition appa- 
ratus which can communicate with each other through 
a wire or wireless communication network is characterized
in that the speech input terminal comprises a speech
input unit, a unit for creating environment information for
speech recognition, which is unique to the speech input
terminal or represents its operation state, and a com- 
munication control unit for transmitting the environment 
information to the speech recognition apparatus, and 
the speech recognition apparatus executes speech recognition
processing on the basis of the environment information.

Description

FIELD OF THE INVENTION 

[0001] The present invention relates to a speech input
terminal, speech recognition apparatus, speech com- 
munication system, and speech communication meth- 
od, which are used to transmit speech data through a 
communication network and execute speech recogni- 
tion. 

BACKGROUND OF THE INVENTION 

[0002] A speech communication system is proposed, 
in which speech data is sent from a speech input termi- 
nal such as a portable telephone to a host server 
through a communication network, and processing for
retrieval of specific information and the like is executed.
In such a speech communication system, since data
can be transmitted/received by speech, operation can 
be facilitated. 

[0003] However, speech data fluctuate depending on 
the characteristics of a speech input terminal such as a 
portable telephone itself, the surrounding environment, 
and the like, and hence satisfactory speech recognition 
may not be performed. 

[0004] In addition, since communication is performed 
under the same communication conditions under any 
circumstances, high communication efficiency cannot 
always be ensured. 

SUMMARY OF THE INVENTION 

[0005] The present invention has been made in con- 
sideration of the situation associated with a speech input 
terminal, and has as its object to provide a speech input 
terminal, speech recognition apparatus, speech com- 
munication system, and speech communication method 
which can implement optimal speech recognition or 
communication. 

[0006] According to the present invention, there is 
provided a speech input terminal for transmitting speech 
data to a speech recognition apparatus through a wire 
or wireless communication network, characterized by 
comprising speech input means, means for creating in- 
formation for speech recognition, which is unique to the 
speech input terminal or represents an operation state 
thereof, and communication means for transmitting the 
information to the speech recognition apparatus. 
[0007] In the present invention, the information is in- 
formation unique to the speech input terminal or infor- 
mation about the surrounding environment or operation 
state associated with the speaker himself/herself. For 
example, the information includes the characteristics of 
the speech input terminal itself, e.g., the characteristics 
of a microphone for speech input, information about the 
surrounding environment in which the speech input terminal
is used, or the speech features of the person using
the speech input terminal. This information also includes
information obtained by performing acoustic analysis 
processing for the original data obtained from the input 
means. 

[0008] The speech input terminal of the present invention
can further comprise means for, when a data conversion
condition for communication based on the information
is received from the speech recognition apparatus,
converting the speech data on the basis of the conversion
condition.

[0009] The speech input terminal of the present inven- 
tion can further comprise means for storing the informa- 
tion, means for determining whether there has been a 
change in the information in each communication, and
means for, when there has been no change in the information,
notifying the speech recognition apparatus of
the corresponding information. 
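
The change check described in paragraph [0009] can be sketched as follows. This is only an illustration under stated assumptions: the class name `EnvironmentInfoCache`, the message fields, and the use of a SHA-256 digest to compare the stored information with the current information are all hypothetical and do not appear in the specification.

```python
import hashlib
import json


def fingerprint(info):
    """Digest of the environment information (hypothetical comparison key)."""
    payload = json.dumps(info, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class EnvironmentInfoCache:
    """Stores the digest of the last transmitted environment information."""

    def __init__(self):
        self._last_digest = None

    def message_for(self, info):
        digest = fingerprint(info)
        if digest == self._last_digest:
            # No change: send only a short notice that refers to stored data.
            return {"type": "unchanged", "digest": digest}
        self._last_digest = digest
        # Changed or first communication: send the full information.
        return {"type": "update", "digest": digest, "info": info}


cache = EnvironmentInfoCache()
first = cache.message_for({"noise_mean": [0.2, 0.0]})
second = cache.message_for({"noise_mean": [0.2, 0.0]})
third = cache.message_for({"noise_mean": [0.4, 0.1]})
```

In practice, measured statistics seldom repeat exactly, so a tolerance-based comparison may be more suitable than an exact digest match.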

[0010] In the speech input terminal of the present invention,
the terminal further comprises means for creating
a speech recognition model on the basis of the information,
and the communication means can transmit
the information and/or the speech recognition model to 
the speech recognition apparatus. 
[0011] According to the present invention, there is 
provided a speech recognition apparatus characterized
by comprising speech recognition means for executing
speech recognition processing for speech data transmitted
from a speech input terminal through a wire or
wireless communication network, and means for receiving
information for speech recognition, which is unique
to the speech input terminal or represents an operation 
state thereof from the speech input terminal, wherein 
said speech recognition means executes speech recog- 
nition processing on the basis of the information. 
[0012] According to the present invention, there is
provided a speech recognition apparatus for executing
speech recognition processing for speech data transmitted
from a speech input terminal through a wire or
wireless communication network, characterized by
comprising means for creating information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof, on the basis of
the transmitted speech data, and means for executing
speech recognition processing on the basis of the information.

[0013] The speech recognition apparatus of the 
present invention can further comprise means for cre- 
ating a speech recognition model on the basis of the in- 
formation. 

[0014] According to the present invention, there is
provided a speech recognition apparatus for executing
speech recognition processing for speech data transmitted
from a speech input terminal through a wire or
wireless communication network, characterized by
comprising means for receiving information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof from the speech
input terminal, means for determining a data conversion
condition for communication on the basis of the information,
and means for transmitting the data conversion
condition to the speech input terminal. 
[0015] According to the present invention, there is 
provided a speech recognition apparatus for executing 
speech recognition processing for speech data trans- 
mitted from a speech input terminal through a wire or 
wireless communication network, characterized by 
comprising means for creating information for speech 
recognition, which is unique to the speech input terminal 
or represents an operation state thereof, on the basis of 
the transmitted speech data, means for determining a 
data conversion condition for communication on the ba- 
sis of the information, and means for transmitting the 
data conversion condition to the speech input terminal. 
[0016] In the speech recognition apparatus of the 
present invention, the data conversion condition can in- 
clude a data conversion condition based on a quantiza- 
tion table created on the basis of the information. 
[0017] The speech recognition apparatus of the 
present invention can further comprise means for, when 
the speech input terminal comprises a plurality of 
speech input terminals, storing the information in corre- 
spondence with each of the speech input terminals. 
[0018] The speech recognition apparatus of the 
present invention can further comprise means for, when 
the speech input terminal comprises a plurality of 
speech input terminals, storing the speech recognition 
model in correspondence with each of the speech input 
terminals. 

[0019] The speech recognition apparatus of the 
present invention can further comprise means for, when 
the speech input terminal comprises a plurality of 
speech input terminals, storing the data conversion con- 
dition in correspondence with each of the speech input 
terminals. 

[0020] According to the present invention, there is 
provided a speech communication system comprising a 
speech input terminal and a speech recognition appa- 
ratus which can communicate with each other through 
a wire or wireless communication network, character- 
ized in that the speech input terminal comprises speech 
input means, means for creating information for speech 
recognition, which is unique to the speech input terminal 
or represents an operation state thereof, and communi- 
cation means for transmitting the information to the 
speech recognition apparatus, and the speech recogni- 
tion apparatus comprises means for executing speech 
recognition processing on the basis of the information. 
[0021] According to the present invention, there is 
provided a speech communication system comprising a 
speech input terminal and a speech recognition appa- 
ratus which can communicate with each other through 
a wire or wireless communication network, character- 
ized in that the speech recognition apparatus comprises 
means for creating information for speech recognition, 
which is unique to the speech input terminal or represents
an operation state thereof, on the basis of speech
data from the speech input terminal, and means for executing
speech recognition processing on the basis of
the information. 

[0022] According to the present invention, there is
provided a speech communication system comprising a
speech input terminal and a speech recognition apparatus
which can communicate with each other through
a wire or wireless communication network, characterized
in that the speech input terminal comprises speech
input means, means for creating information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof, and communication
means for transmitting the information to the
speech recognition apparatus, and the speech recognition
apparatus comprises means for determining a data
conversion condition for communication on the basis of 
the information, and means for transmitting the data 
conversion condition to the speech input terminal. 
[0023] According to the present invention, there is
provided a speech communication system comprising a
speech input terminal and a speech recognition apparatus
which can communicate with each other through
a wire or wireless communication network, characterized
in that the speech recognition apparatus comprises
means for creating information for speech recognition,
which is unique to the speech input terminal or represents
an operation state thereof, on the basis of speech
data from the speech input terminal, means for determining
a data conversion condition for communication
on the basis of the information, and means for transmitting
the data conversion condition to the speech input
terminal. 

[0024] According to the present invention, there is
provided a speech communication method of transmitting
speech data from a speech input terminal to a
speech recognition apparatus through a wire or wireless
communication network, characterized by comprising, in
the speech input terminal, the step of creating information
for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof, and the step of transmitting the information to 
the speech recognition apparatus. 
[0025] According to the present invention, there is
provided a speech communication method of executing
speech recognition processing for speech data transmitted
from a speech input terminal through a wire or
wireless communication network, characterized by
comprising the step of receiving information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof from the speech
input terminal, and the step of executing speech recog- 
nition processing on the basis of the information. 
[0026] According to the present invention, there is
provided a speech communication method of executing
speech recognition processing for speech data transmitted
from a speech input terminal through a wire or
wireless communication network, characterized by
comprising the step of creating information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof, on the basis of 
data transmitted from the speech input terminal, and the 
step of executing speech recognition processing on the 
basis of the information. 

[0027] According to the present invention, there is 
provided a speech communication method of executing 
speech recognition processing for speech data trans- 
mitted from a speech input terminal through a wire or 
wireless communication network, characterized by 
comprising the step of receiving information for speech 
recognition, which is unique to the speech input terminal 
or represents an operation state thereof from the speech 
input terminal, the step of determining a data conversion 
condition for communication on the basis of the infor- 
mation, and the step of transmitting the data conversion 
condition to the speech input terminal. 
[0028] According to the present invention, there is 
provided a speech communication method of executing 
speech recognition processing for speech data trans- 
mitted from a speech input terminal through a wire or 
wireless communication network, characterized by 
comprising the step of creating information for speech 
recognition, which is unique to the speech input terminal 
or represents an operation state thereof, on the basis of 
data transmitted from the speech input terminal, the step 
of determining a data conversion condition for commu- 
nication on the basis of the information, and the step of 
transmitting the data conversion condition to the speech 
input terminal. 

[0029] According to the present invention, there is 
provided a speech communication method between a 
speech input terminal and a speech recognition appa- 
ratus which can communicate with each other through 
a wire or wireless communication network, character- 
ized by comprising, in the speech input terminal, the 
step of creating information for speech recognition, 
which is unique to the speech input terminal or repre- 
sents an operation state thereof, and the step of trans- 
mitting the information to the speech recognition appa- 
ratus, and, in the speech recognition apparatus, the step 
of executing speech recognition processing on the basis 
of the information. 

[0030] According to the present invention, there is 
provided a speech communication method between a 
speech input terminal and a speech recognition appa- 
ratus which can communicate with each other through 
a wire or wireless communication network, character- 
ized by comprising, in the speech recognition appara- 
tus, the step of creating information for speech recogni- 
tion, which is unique to the speech input terminal or rep- 
resents an operation state thereof, on the basis of 
speech data from the speech input terminal, and the 
step of executing speech recognition processing on the 
basis of the information. 

[0031] According to the present invention, there is
provided a speech communication method between a
speech input terminal and a speech recognition apparatus
which can communicate with each other through
a wire or wireless communication network, characterized
by comprising, in the speech input terminal, the
step of creating information for speech recognition,
which is unique to the speech input terminal or represents
an operation state thereof, and the step of transmitting
the information to the speech recognition apparatus,
and, in the speech recognition apparatus, the step
of determining a data conversion condition for communication
on the basis of the information, and the step of
transmitting the data conversion condition to the speech 
input terminal. 

[0032] According to the present invention, there is
provided a speech communication method between a
speech input terminal and a speech recognition apparatus
which can communicate with each other through
a wire or wireless communication network, characterized
by comprising, in the speech recognition apparatus,
the step of creating information for speech recognition,
which is unique to the speech input terminal or represents
an operation state thereof, on the basis of
speech data from the speech input terminal, the step of
determining a data conversion condition for communication
on the basis of the information, and the step of
transmitting the data conversion condition to the speech
input terminal. 

[0033] According to the present invention, there is
provided a storage medium recording a program for, in
order to transmit speech data from a speech input terminal
to a speech recognition apparatus through a wire
or wireless communication network, causing a computer
to function as means for creating information for
speech recognition, which is unique to the speech input
terminal or represents an operation state thereof, and
communication means for transmitting the information
to the speech recognition apparatus. 
[0034] According to the present invention, there is
provided a storage medium recording a program for, in
order to execute speech recognition processing on the
basis of speech data sent from a speech input terminal
through a wire or wireless communication network,
causing a computer to function as means for receiving
information for speech recognition, which is unique to
the speech input terminal or represents an operation
state thereof from the speech input terminal, and means
for executing speech recognition processing on the ba- 
sis of the information. 

[0035] According to the present invention, there is
provided a storage medium recording a program for, in
order to execute speech recognition processing on the
basis of speech data sent from a speech input terminal
through a wire or wireless communication network,
causing a computer to function as means for creating
information for speech recognition, which is unique to
the speech input terminal or represents an operation
state thereof, on the basis of the speech data transmitted
from the speech input terminal, and means for executing
speech recognition processing on the basis of the
information.

[0036] According to the present invention, there is 
provided a storage medium recording a program for, in 
order to execute speech recognition processing on the 
basis of speech data sent from a speech input terminal 
through a wire or wireless communication network, 
causing a computer to function as means for receiving 
information for speech recognition, which is unique to 
the speech input terminal or represents an operation 
state thereof from the speech input terminal, and means 
for determining a data conversion condition for commu- 
nication on the basis of the information, and means for 
transmitting the data conversion condition to the speech 
input terminal. 

[0037] According to the present invention, there is 
provided a storage medium recording a program for, in 
order to execute speech recognition processing on the 
basis of speech data sent from a speech input terminal 
through a wire or wireless communication network, 
causing a computer to function as means for creating 
information for speech recognition, which is unique to 
the speech input terminal or represents an operation 
state thereof, on the basis of the speech data transmit- 
ted from the speech input terminal, means for determin- 
ing a data conversion condition for communication on 
the basis of the information, and means for transmitting 
the data conversion condition to the speech input termi- 
nal. 

[0038] Other features and advantages of the present 
invention will be apparent from the following description 
taken in conjunction with the accompanying drawings, 
in which like reference characters designate the same 
or similar parts throughout the figures thereof. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0039] The accompanying drawings, which are incor- 
porated in and constitute a part of the specification, il- 
lustrate embodiments of the invention and, together with 
the description, serve to explain the principles of the in- 
vention. 

Fig. 1 is a block diagram showing the arrangement 
of a speech communication system according to an 
embodiment of the present invention; and 
Fig. 2 is a flow chart showing the processing per- 
formed by the speech communication system ac- 
cording to the embodiment. 

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

[0040] Preferred embodiments of the present inven- 
tion will now be described in detail in accordance with 
the accompanying drawings. 

[0041] Fig. 1 is a block diagram showing the arrange- 
ment of a speech communication system according to 
an embodiment of the present invention. 



[0042] The speech communication system is com- 
prised of a portable terminal 100 serving as a speech 
input terminal, a main body 200 serving as a speech rec- 
ognition apparatus, and a communication line 300 for 
connecting these components to allow them to communicate
with each other.

[0043] The portable terminal 100 includes an input/output
unit 101 for inputting/outputting speech, a communication
control unit 102 for executing communication
processing with the main body 200, an acoustic
processing unit 103 for performing acoustic processing
for the input speech, an environment information creation
unit 104 for creating information unique to the portable
terminal 100 or information indicating its operation
state (to be referred to as environment information hereinafter
in this embodiment), and a speech communication
information creation unit 105.
[0044] The main body 200 includes an environment
adaptation unit 201 for performing processing based on
the environment information of the portable terminal
100, a communication control unit 202 for executing
communication processing with the portable terminal
100, a speech recognition unit 203 for executing speech
recognition processing for speech data from the portable
terminal 100, a speech communication information
creation unit 204 for setting data conversion conditions
for communication, a speech recognition model holding
unit 205, and an application 206.
[0045] The sequence of operation of the speech communication
system having the above arrangement will
be described next with reference to Fig. 2. Fig. 2 is a 
flow chart showing the processing performed by the 
speech communication system. 

[0046] The processing performed by the speech communication
system is constituted by an initialization
mode of analyzing environment information and a
speech recognition mode of communicating speech data.

[0047] In step S401, all processes are started. Information
for the start of processing is sent from the input/output
unit 101 to the communication control unit 202 of
the main body 200 through the communication control 
unit 102. 

[0048] In step S402, a message is selectively sent
from the speech recognition unit 203 or application 206
to the portable terminal 100. When, for example, supervised
speaker adaptation based on environment information
is to be performed, a list of contents to be read
aloud by a user is sent and output as a message (speech
or characters) from the input/output unit 101 of the portable
terminal 100. When microphone adaptation based
on environment information is to be performed, information
for prompting the utterance of speech for a few seconds
may be output as a message from the input/output
unit 101 of the portable terminal 100. On the other hand,
when noise adaptation based on environment information
is to be performed, step S402 may be skipped.
[0049] In step S403, speech data (containing noise)
is entered from the input/output unit 101 to create environment
information in the portable terminal 100.

[0050] In step S404, the acoustic processing unit 103
acoustically analyzes the entered speech data. If the environment
information is to be converted into a model
(average, variance, or phonemic model), the information
is sent to the environment information creation unit
104. Otherwise, the acoustic analysis result is sent from
the communication control unit 102 to the main body 200.
Note that the speech data may be sent directly to the
main body 200 without any acoustic analysis, to be
subjected to analysis and the like on the main body 200
side.

[0051] When the environment information is convert- 
ed into a model in step S404, the flow advances to step 
S405 to cause the environment information creation unit 
104 to create environment information. For the purpose 
of noise adaptation, for example, environment informa- 
tion is created by detecting a non-speech interval and 
obtaining the average and variance of parameters in the 
interval. For the purpose of microphone adaptation, en- 
vironment information is created by obtaining the aver- 
age and variance of parameters in a speech interval. 
For the purpose of speaker adaptation, a phonemic 
model or the like is created. 
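
The averages and variances mentioned in paragraph [0051] can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the frame layout (an energy value paired with a parameter vector), the energy threshold used to separate speech from non-speech intervals, and the names `interval_statistics` and `THRESHOLD` are assumptions introduced here.

```python
from statistics import fmean, pvariance


def interval_statistics(frames, select):
    """Per-dimension mean and variance over the frames chosen by select(energy).

    frames: list of (energy, parameter_vector) pairs (assumed layout).
    """
    chosen = [vector for energy, vector in frames if select(energy)]
    if not chosen:
        raise ValueError("no frames fall in the requested interval")
    columns = list(zip(*chosen))  # one tuple of values per dimension
    means = [fmean(column) for column in columns]
    variances = [pvariance(column) for column in columns]
    return means, variances


THRESHOLD = 1.0  # hypothetical energy threshold between noise and speech
frames = [
    (0.01, [0.1, -0.1]),
    (0.05, [0.3, 0.1]),
    (20.5, [5.0, 4.0]),
    (30.5, [6.0, -5.0]),
]

# Noise adaptation: statistics over the non-speech interval.
noise_mean, noise_var = interval_statistics(frames, lambda e: e < THRESHOLD)
# Microphone adaptation: statistics over the speech interval.
speech_mean, speech_var = interval_statistics(frames, lambda e: e >= THRESHOLD)
```

The same routine serves both adaptation purposes; only the interval selection differs.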

[0052] In step S406, the created environment infor- 
mation model, acoustic analysis result, or speech is sent 
from the communication control unit 102 to the main 
body 200. 

[0053] In step S407, the main body 200 receives the sent
environment information through the communication
control unit 202.

[0054] In step S408, the environment adaptation unit 
201 performs environment adaptation with respect to a 
speech recognition model in the speech recognition 
model holding unit 205 on the basis of the environment 
information to update the speech recognition model into 
an environment adaptation speech recognition model. 
This model is then held by the speech recognition model 
holding unit 205. 

[0055] As a method for environment adaptation, for
example, a PMC (parallel model combination) technique
can be used, which creates
an environment adaptation speech recognition model
from a noise model and speech recognition model. In
the case of microphone adaptation, for example, a CMS
(cepstral mean subtraction) technique can be used, which
creates an environment adaptive speech recognition model
by using the average of speech for adaptation and a
speech recognition model.
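
A mean-offset adaptation in the spirit of the CMS technique can be sketched as follows. The sketch assumes that the parameters are cepstral vectors, that the stored recognition model was trained on mean-normalized parameters, and that adaptation shifts every model mean by the average of the adaptation speech; none of these details is stated in the specification, and the function names are hypothetical.

```python
def cepstral_mean(frames):
    """Per-dimension average of the adaptation speech parameters."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def adapt_model_means(model_means, channel_mean):
    """Shift each model mean vector by the terminal's average (assumed scheme).

    A fixed microphone characteristic acts as a constant offset in the
    cepstral domain, so adding the offset moves the model to the terminal.
    """
    return [
        [value + offset for value, offset in zip(mean, channel_mean)]
        for mean in model_means
    ]


adaptation_frames = [[1.0, 2.0], [3.0, 4.0]]
model_means = [[0.0, 0.0], [1.0, -1.0]]

channel_mean = cepstral_mean(adaptation_frames)
adapted = adapt_model_means(model_means, channel_mean)
```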

[0056] In the case of speaker adaptation, for example, 
a method of creating a speaker adaptation model by us- 
ing a speaker adaptation model and speech recognition 
model can be used. If a speech or acoustic analysis re- 
sult is sent instead of an environment information model, 
a method of converting environment information into a 
model and further performing adaptation on the main 
body 200 side can be used. Alternatively, a method of
performing environment adaptation by directly using a
speech or acoustic analysis result, an EM learning technique,
a VFS speaker adaptation technique, or the like
can be used as an environment adaptation method. Creating
an environment adaptive speech recognition model
can improve recognition performance.
[0057] Obviously, a speech recognition model may be 
created on the portable terminal 100 side and sent to 
the main body 200 to be used. 

[0058] In step S409, in order to improve the communication
efficiency of speech recognition, the speech
communication information creation unit 204 performs
environment adaptation for a table for the creation of
communication speech information. A method of creating
a scalar quantization table of parameters of the respective
dimensions which are used for speech recognition
by using the distribution of environment adaptive
speech recognition models will be described below.
Various methods can be used for this purpose. The simplest
method is to search the 3σ ranges of the respective
dimensions for the maximum and minimum values, and
divide the interval therebetween into equal portions.
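
The simplest table construction described above can be sketched as follows. The sketch assumes that each dimension of the adapted model is given as a list of Gaussian (mean, variance) pairs and that the quantization points are placed at the midpoints of the equal portions; the data layout and the function names are hypothetical.

```python
import math


def three_sigma_bounds(gaussians):
    """Minimum and maximum of mean ± 3σ over all Gaussians of one dimension."""
    lows = [mean - 3.0 * math.sqrt(var) for mean, var in gaussians]
    highs = [mean + 3.0 * math.sqrt(var) for mean, var in gaussians]
    return min(lows), max(highs)


def uniform_levels(low, high, count):
    """Divide [low, high] into equal portions; return each portion's midpoint."""
    step = (high - low) / count
    return [low + step * (i + 0.5) for i in range(count)]


def build_tables(model, count):
    """One scalar quantization table per dimension of the model."""
    return [uniform_levels(*three_sigma_bounds(dim), count) for dim in model]


# Two dimensions; the first has two Gaussians, the second has one.
model = [[(0.0, 1.0), (2.0, 4.0)], [(0.0, 0.25)]]
tables = build_tables(model, count=4)
```

With four points per dimension, each parameter can be sent in two bits.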
[0059] The number of quantization points may be decreased
by a method of merging all distributions into one
distribution, searching its 3σ range (e.g., a range in which most
of the samples appearing in a Gaussian distribution are included)
for the maximum and minimum values, and dividing
the interval therebetween into equal portions.
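
One possible way of merging all distributions into one distribution is moment matching, sketched below. The specification does not say how the merge is performed; equal weighting of the Gaussians and the function names are assumptions made for this illustration.

```python
import math


def merge_gaussians(gaussians):
    """Moment-match equally weighted (mean, variance) pairs into one Gaussian."""
    count = len(gaussians)
    mean = sum(m for m, _ in gaussians) / count
    second_moment = sum(v + m * m for m, v in gaussians) / count
    return mean, second_moment - mean * mean


def three_sigma_range(mean, variance):
    """Interval mean ± 3σ of a single Gaussian."""
    spread = 3.0 * math.sqrt(variance)
    return mean - spread, mean + spread


merged_mean, merged_var = merge_gaussians([(0.0, 1.0), (2.0, 4.0)])
low, high = three_sigma_range(merged_mean, merged_var)
```

The interval from `low` to `high` is then divided into equal portions in the same way as in the simplest method.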
[0060] As a more precise method, for example, a
method of assigning quantization points in accordance
with the bias of all distributions may be used. In this
method, since a scalar quantization table of the respective
dimensions is created by using the distribution of
environment adaptive speech recognition models, the
bit rate for communication can be decreased without degrading
the recognition performance, thus allowing efficient
communication.

[0061] In step S410, the created scalar quantization 
table is transmitted to the portable terminal 100. 
[0062] In step S411, the created scalar quantization table
is received by the portable terminal 100 and stored
in the speech communication information creation unit 
105. 

[0063] With the above operation, the initialization
mode is terminated. If a plurality of portable terminals
100 are present, the main body 200 can store data such
as environment information, speech recognition models,
and quantization tables in units of portable terminals.
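
Storage in units of portable terminals can be sketched as a lookup keyed by a terminal identifier. The identifier, the class names `TerminalProfile` and `ProfileStore`, and the field layout are hypothetical; the specification does not describe how terminals are identified.

```python
from dataclasses import dataclass, field


@dataclass
class TerminalProfile:
    """Data the main body keeps for one portable terminal (assumed layout)."""
    environment_info: dict = field(default_factory=dict)
    recognition_model: object = None
    quantization_tables: list = field(default_factory=list)


class ProfileStore:
    """Maps a terminal identifier to its stored profile."""

    def __init__(self):
        self._profiles = {}

    def profile_for(self, terminal_id):
        # Create an empty profile the first time a terminal is seen.
        return self._profiles.setdefault(terminal_id, TerminalProfile())


store = ProfileStore()
store.profile_for("terminal-A").quantization_tables = [[-2.5, 0.5, 3.5, 6.5]]
store.profile_for("terminal-B").environment_info = {"noise_mean": [0.2, 0.0]}
```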

[0064] The flow then shifts to the speech recognition
mode. 

[0065] In step S412, speech is input through the input/output
unit 101.

[0066] In step S413, the input speech data is acoustically
analyzed by the acoustic processing unit 103, and
the resultant data is sent to the speech communication 
information creation unit 105. 

[0067] In step S414, the speech communication information
creation unit 105 performs scalar quantization of
the acoustic analysis result on the speech data by using 
a scalar quantization table, and encodes the data as 
speech communication information. The encoded 
speech data is transmitted to the main body 200 through 
the communication control unit 102. 
[0068] In step S415, the main body 200 causes the 
speech recognition unit 203 to decode the received 
speech data, execute speech recognition processing, 
and output the recognition result. Obviously, in this 
speech recognition processing, the previously created 
speech recognition model is used. 
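
The encoding in step S414 and the decoding in step S415 can be sketched as a nearest-point lookup in the scalar quantization table of each dimension. The table values, the choice of the nearest point, and the function names are assumptions for illustration; the specification does not fix the code format.

```python
def encode(vector, tables):
    """Terminal side: index of the nearest quantization point per dimension."""
    return [
        min(range(len(table)), key=lambda i: abs(table[i] - value))
        for value, table in zip(vector, tables)
    ]


def decode(codes, tables):
    """Main body side: map each index back to its quantization point."""
    return [table[code] for code, table in zip(codes, tables)]


def bits_per_frame(tables):
    """Bits needed to send one index per dimension."""
    return sum((len(table) - 1).bit_length() for table in tables)


# Hypothetical tables: four points for dimension 0, two for dimension 1.
tables = [[-2.5, 0.5, 3.5, 6.5], [-1.0, 1.0]]
codes = encode([0.9, -0.2], tables)
restored = decode(codes, tables)
frame_bits = bits_per_frame(tables)
```

Only the indices travel over the communication line, so the bit rate depends on the table sizes rather than on the precision of the original parameters.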
[0069] In step S416, the speech recognition result is 
interpreted by the application 206 to obtain an applica- 
tion corresponding to the result, and the application re- 
sult is sent to the communication control unit 202. 
[0070] In step S417, the application result is sent to 
the portable terminal 100 through the communication 
control unit 202 of the main body 200. 
[0071] In step S418, the portable terminal 100 re- 
ceives the application result through the communication 
control unit 102. 

[0072] In step S419, the portable terminal 100 outputs
the application result from the input/output unit 101. 
When speech recognition is to be continued, the flow 
returns to step S412. 

[0073] In step S420, the communication is terminat- 
ed. 

[0074] As described above, in the speech communi- 
cation system of this embodiment, since speech recog- 
nition is performed by using a speech recognition model 
that adapts to the environment information of the porta- 
ble terminal 100, optimal speech recognition can be ex- 
ecuted in correspondence with each portable terminal. 
In addition, since communication conditions are set on 
the basis of environment information, communication 
efficiency can be improved in correspondence with each 
portable terminal. 

[0075] In this embodiment, in the case of noise, the 
average and variance of parameters in a noise interval 
are obtained and sent to the main body to perform noise 
adaptation of a speech recognition model by the PMC 
technique. Obviously, however, another noise adapta- 
tion method can be used. In addition, according to the 
method described above, an average and variance are 
obtained on the terminal side and transmitted. However, 
speech information may be sent to the main body side 
to obtain an average and variance so as to perform 
noise adaptation. 
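
The average and variance of parameters in a noise interval can be computed as sketched below. This shows only the statistics that are obtained and sent; it does not implement the PMC technique itself. The frame values and the function name are hypothetical.

```python
from statistics import fmean, pvariance


def noise_statistics(frames):
    """Per-dimension mean and variance over the frames of a noise interval."""
    dims = list(zip(*frames))            # transpose: one tuple per parameter
    means = [fmean(d) for d in dims]
    variances = [pvariance(d) for d in dims]
    return means, variances


noise_frames = [[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]]   # hypothetical parameters
means, variances = noise_statistics(noise_frames)
```

The same function could run on either side, matching the two alternatives described above: computing on the terminal and transmitting the statistics, or sending speech information and computing on the main body.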

[0076] With regard to microphone characteristics,
this embodiment has exemplified the method of obtaining
the average and variance of parameters in a speech
interval of a certain duration, sending them to the main
body, and performing microphone characteristic adaptation
of a speech recognition model by the CMS technique.
Obviously, however, another microphone characteristic
adaptation method can be used. In addition, according
to the method described above, an average and
variance are obtained on the terminal side and transmitted.
However, speech information may be sent to the
main body side to obtain an average and variance so as
to perform microphone characteristic adaptation.
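
Assuming that CMS here denotes cepstral mean subtraction, its core operation can be sketched as follows. Note that this sketch normalizes the feature frames directly, whereas the embodiment uses the transmitted average to adapt the speech recognition model; the frame values are hypothetical.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-dimension mean of the interval from every frame."""
    n = len(frames)
    means = [sum(col) / n for col in zip(*frames)]
    return [[x - m for x, m in zip(frame, means)] for frame in frames]


frames = [[1.0, 4.0], [3.0, 8.0]]   # hypothetical cepstral parameters
normalized = cepstral_mean_subtraction(frames)
```

Subtracting the long-term mean removes a constant offset in the cepstral domain, which is where a fixed microphone characteristic appears.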
[0077] This embodiment has exemplified the speaker
adaptation method of creating a simple phonemic model
representing the user's speech features in advance, sending
it to the main body, and performing speaker adaptation
of a speech recognition model. However, speech
information may be sent to the main body side to perform
speaker adaptation by using speech on the main
body side. Obviously, in this case as well, various other
speaker adaptation methods can be used.
[0078] In this embodiment, noise adaptation, microphone
adaptation, and speaker adaptation are described
independently. However, they can be properly
combined and used.

[0079] In this embodiment, the initialization mode is
performed before the speech recognition mode.
Once the initialization mode is completed, however,
speech recognition can be resumed from the speech
recognition mode under the same environment. In this
case, the previous environment information is stored on
the portable terminal 100 side, and environment information
created in resuming speech recognition is compared
with the stored information. If no change is detected,
the corresponding notification is sent to the main
body 200, or the main body 200 performs such determination
on the basis of the sent environment information,
thus executing speech recognition.
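
The terminal-side comparison on resumption can be sketched as follows. The message format, the tolerance, and the key name are all hypothetical; the specification does not state how a change is detected or how the notification is encoded.

```python
def resume_message(stored_env, current_env, tolerance=0.05):
    """Return what the terminal sends when resuming the recognition mode."""
    same_keys = stored_env.keys() == current_env.keys()
    unchanged = same_keys and all(
        abs(stored_env[k] - current_env[k]) <= tolerance for k in stored_env
    )
    if unchanged:
        # No change detected: notify the main body so it reuses its stored data.
        return {"type": "no_change"}
    # Otherwise send the new environment information for re-adaptation.
    return {"type": "environment", "data": current_env}


quiet = resume_message({"noise_mean": 0.20}, {"noise_mean": 0.21})
noisy = resume_message({"noise_mean": 0.20}, {"noise_mean": 0.60})
```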

[0080] In this embodiment, environment information
is used for both speech recognition processing and an
improvement in communication efficiency. Obviously, however,
only one of these operations may be executed by using
the environment information.

[0081] Although the preferred embodiment of the
present invention has been described above, the objects
of the present invention are also achieved by supplying
a storage medium, which records a program code
of a software program that can realize the functions of
the above-mentioned embodiments, to the system or
apparatus, and reading out and executing the program
code stored in the storage medium by a computer (or a
CPU or MPU) of the system or apparatus. In this case,
the program code itself read out from the storage medium
realizes the functions of the above-mentioned
embodiments, and the storage medium which stores the
program code constitutes the present invention. The
functions of the above-mentioned embodiments may be
realized not only by executing the readout program code
by the computer but also by some or all of actual
processing operations executed by an OS (operating
system) running on the computer on the basis of an
instruction of the program code.

[0082] Furthermore, the functions of the above-mentioned
embodiments may be realized by some or all of
actual processing operations executed by a CPU or the
like arranged in a function extension board or a function
extension unit, which is inserted in or connected to the
computer, after the program code read out from the storage
medium is written in a memory of the extension
board or unit.

[0083] As many apparently widely different embodiments
of the present invention can be made without departing
from the spirit and scope thereof, it is to be understood
that the invention is not limited to the specific
embodiments thereof except as defined in the claims.



Claims 

1. A speech input terminal for transmitting speech data
to a speech recognition apparatus through a wire
or wireless communication network, characterized
by comprising:

speech input means; 

means for creating information for speech rec- 
ognition, which is unique to said speech input 
terminal or represents an operation state there- 
of; and 

communication means for transmitting the in- 
formation to said speech recognition appara- 
tus. 

2. The terminal according to claim 1, characterized in
that the information is based on at least one of a
characteristic of said speech input means, a noise
characteristic, and a speaker characteristic.

3. The terminal according to claim 1 or 2, character- 
ized by further comprising means for, when a data 
conversion condition for communication based on 
the information is received from said speech recog- 
nition apparatus, converting the speech data on the 
basis of the conversion condition. 

4. The terminal according to claim 1, 2 or 3, characterized
by further comprising:

means for storing the information; 
means for determining whether there has been 
a change in the information in each communi- 
cation; and 

means for, when there has been no change in 
the information, notifying said speech recogni- 
tion apparatus of the corresponding informa- 
tion. 

5. The terminal according to any preceding claim,
characterized in that

said terminal further comprises means for creating
a speech recognition model on the basis
of the information, and

said communication means transmits the information
and/or the speech recognition model to
said speech recognition apparatus.

6. A speech recognition apparatus characterized by
comprising:

speech recognition means for executing
speech recognition processing for speech data
transmitted from a speech input terminal
through a wire or wireless communication network; and

means for receiving information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof
from said speech input terminal, wherein said
speech recognition means executes speech
recognition processing on the basis of the information.

7. A speech recognition apparatus for executing
speech recognition processing for speech data
transmitted from a speech input terminal through a
wire or wireless communication network, characterized
by comprising:

means for creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
on the basis of the transmitted speech data;
and

means for executing speech recognition
processing on the basis of the information.

8. The apparatus according to claim 6 or 7, characterized
by further comprising means for creating a
speech recognition model on the basis of the information.

9. A speech recognition apparatus for executing
speech recognition processing for speech data
transmitted from a speech input terminal through a
wire or wireless communication network, characterized
by comprising:

means for receiving information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof
from said speech input terminal;
means for determining a data conversion condition
for communication on the basis of the information; and

means for transmitting the data conversion
condition to said speech input terminal.

10. A speech recognition apparatus for executing
speech recognition processing for speech data
transmitted from a speech input terminal through a
wire or wireless communication network, characterized
by comprising:

means for creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
on the basis of the transmitted speech data;
means for determining a data conversion condition
for communication on the basis of the information; and

means for transmitting the data conversion
condition to said speech input terminal.

11. The apparatus according to claim 9 or 10, charac- 
terized in that the data conversion condition in- 
cludes a data conversion condition based on a 
quantization table created on the basis of the infor- 
mation. 

12. The apparatus according to any of claims 6, 7, 9 or 
10 characterized by further comprising means for, 
when said speech input terminal comprises a plurality
of speech input terminals, storing the information
in correspondence with each of said speech
input terminals.

13. The apparatus according to claim 8, characterized 
by further comprising means for, when said speech 
input terminal comprises a plurality of speech input 
terminals, storing the speech recognition model in 
correspondence with each of said speech input ter- 
minals. 

14. The apparatus according to claim 9 or 10, charac- 
terized by further comprising means for, when said 
speech input terminal comprises a plurality of 
speech input terminals, storing the data conversion 
condition in correspondence with each of said 
speech input terminals. 

15. A speech communication system comprising a 
speech input terminal and a speech recognition ap- 
paratus which can communicate with each other 
through a wire or wireless communication network, 

characterized in that said speech input termi- 
nal comprises 

speech input means, 

means for creating information for speech rec- 
ognition, which is unique to said speech input 
terminal or represents an operation state there- 
of, and 

communication means for transmitting the in- 
formation to said speech recognition appara- 
tus, and 

said speech recognition apparatus comprises 
means for executing speech recognition 
processing on the basis of the information. 



16. A speech communication system comprising a 
speech input terminal and a speech recognition ap- 
paratus which can communicate with each other 
through a wire or wireless communication network, 

characterized in that said speech recognition
apparatus comprises

means for creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
on the basis of speech data from said
speech input terminal, and
means for executing speech recognition
processing on the basis of the information.

17. A speech communication system comprising a 
speech input terminal and a speech recognition ap- 
paratus which can communicate with each other 
through a wire or wireless communication network, 

characterized in that said speech input terminal
comprises

speech input means,

means for creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
and

communication means for transmitting the information
to said speech recognition apparatus, and

said speech recognition apparatus comprises

means for determining a data conversion condition
for communication on the basis of the information, and

means for transmitting the data conversion
condition to said speech input terminal.

18. A speech communication system comprising a
speech input terminal and a speech recognition apparatus
which can communicate with each other
through a wire or wireless communication network,
characterized in that said speech recognition
apparatus comprises

means for creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
on the basis of speech data from said
speech input terminal,

means for determining a data conversion condition
for communication on the basis of the information, and

means for transmitting the data conversion
condition to said speech input terminal.

19. A speech communication method of transmitting
speech data from a speech input terminal to a
speech recognition apparatus through a wire or
wireless communication network, characterized by
comprising:

in the speech input terminal,
the step of creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof; and
the step of transmitting the information to the
speech recognition apparatus.

20. A speech communication method of executing
speech recognition processing for speech data
transmitted from a speech input terminal through a
wire or wireless communication network, characterized
by comprising:

the step of receiving information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof
from the speech input terminal; and
the step of executing speech recognition
processing on the basis of the information.

21. A speech communication method of executing
speech recognition processing for speech data
transmitted from a speech input terminal through a
wire or wireless communication network, characterized
by comprising

the step of creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
on the basis of data transmitted from the
speech input terminal; and
the step of executing speech recognition
processing on the basis of the information.

22. A speech communication method of executing
speech recognition processing for speech data
transmitted from a speech input terminal through a
wire or wireless communication network, characterized
by comprising

the step of receiving information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof
from the speech input terminal;
the step of determining a data conversion condition
for communication on the basis of the information; and

the step of transmitting the data conversion
condition to the speech input terminal.

23. A speech communication method of executing
speech recognition processing for speech data
transmitted from a speech input terminal through a
wire or wireless communication network, characterized
by comprising

the step of creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
on the basis of data transmitted from the
speech input terminal;

the step of determining a data conversion condition
for communication on the basis of the information; and

the step of transmitting the data conversion
condition to the speech input terminal.

24. A speech communication method between a
speech input terminal and a speech recognition apparatus
which can communicate with each other
through a wire or wireless communication network,
characterized by comprising:

in the speech input terminal,
the step of creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof; and

the step of transmitting the information to the
speech recognition apparatus, and

in the speech recognition apparatus,
the step of executing speech recognition
processing on the basis of the information.

25. A speech communication method between a 
speech input terminal and a speech recognition ap- 
paratus which can communicate with each other 
through a wire or wireless communication network, 
characterized by comprising: 

in the speech recognition apparatus, 
the step of creating information for speech rec- 
ognition, which is unique to said speech input 
terminal or represents an operation state there- 
of, on the basis of speech data from the speech 
input terminal; and 

the step of executing speech recognition 
processing on the basis of the information. 

26. A speech communication method between a 
speech input terminal and a speech recognition ap- 
paratus which can communicate with each other 
through a wire or wireless communication network, 
characterized by comprising: 

in the speech input terminal, 
the step of creating information for speech rec- 
ognition, which is unique to said speech input 
terminal or represents an operation state there- 
of; and 
the step of transmitting the information to the
speech recognition apparatus, and 
in the speech recognition apparatus, 
the step of determining a data conversion con- 
dition for communication on the basis of the in- 
formation; and 

the step of transmitting the data conversion 
condition to the speech input terminal. 

27. A speech communication method between a 
speech input terminal and a speech recognition ap- 
paratus which can communicate with each other 
through a wire or wireless communication network, 
characterized by comprising: 

in the speech recognition apparatus, 
the step of creating information for speech rec- 
ognition, which is unique to said speech input 
terminal or represents an operation state there- 
of, on the basis of speech data from the speech 
input terminal; 

the step of determining a data conversion con- 
dition for communication on the basis of the in- 
formation; and 

the step of transmitting the data conversion 
condition to the speech input terminal. 

28. A storage medium recording a program for, in order 
to transmit speech data from a speech input termi- 
nal to a speech recognition apparatus through a 
wire or wireless communication network, causing a 
computer to function as 

means for creating information for speech rec- 
ognition, which is unique to said speech input 
terminal or represents an operation state there- 
of, and 

communication means for transmitting the in- 
formation to said speech recognition appara- 
tus. 

29. A storage medium recording a program for, in order 
to execute speech recognition processing on the 
basis of speech data sent from a speech input ter- 
minal through a wire or wireless communication 
network, causing a computer to function as 

means for receiving information for speech rec- 
ognition, which is unique to said speech input 
terminal or represents an operation state there- 
of from said speech input terminal; and 
means for executing speech recognition 
processing on the basis of the information. 

30. A storage medium recording a program for, in order
to execute speech recognition processing on the
basis of speech data sent from a speech input terminal
through a wire or wireless communication
network, causing a computer to function as

means for creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
on the basis of the speech data transmitted
from said speech input terminal, and
means for executing speech recognition
processing on the basis of the information.

31. A storage medium recording a program for, in order
to execute speech recognition processing on the
basis of speech data sent from a speech input terminal
through a wire or wireless communication
network, causing a computer to function as

means for receiving information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof
from said speech input terminal;

means for determining a data conversion condition
for communication on the basis of the information, and

means for transmitting the data conversion
condition to said speech input terminal.

32. A storage medium recording a program for, in order
to execute speech recognition processing on the
basis of speech data sent from a speech input terminal
through a wire or wireless communication
network, causing a computer to function as

means for creating information for speech recognition,
which is unique to said speech input
terminal or represents an operation state thereof,
on the basis of the speech data transmitted
from said speech input terminal,
means for determining a data conversion condition
for communication on the basis of the information, and

means for transmitting the data conversion
condition to said speech input terminal.

33. Computer executable process steps for causing a
programmable computer device to carry out the
method of any of claims 19 to 27.
FIG. 2